  DBCSR is a library designed to efficiently perform sparse 
matrix-matrix multiplication, among other operations. 
  It is parallelized with MPI and OpenMP and can exploit NVIDIA and AMD 
GPUs via CUDA and HIP.
  DBCSR was developed as a part of CP2K, where it provides core 
functionality for linear-scaling electronic structure theory. It is 
now released as a standalone library for integration in other projects.

DBCSR requires an MPI implementation. However, the package does not work
with mpich; use openmpi instead.

* HIP and OpenCL support is still experimental
